Speech recognition performance comparison between DSR and AMR transcoded speech

Authors

  • Holly Kelleher
  • David Pearce
  • Douglas Ealey
  • Laurent Mauuary
Abstract

In this paper, the speech recognition performance obtained with a Distributed Speech Recognition (DSR) architecture is compared with that obtained when the speech is first transcoded using the Adaptive Multi-Rate (AMR) speech codec at 4.75 and 12.2 kbps. In a like-versus-like comparison, made using the Advanced DSR Front-end and the Aurora reference back-end, the DSR architecture gives substantial gains in speech recognition performance. The evaluations measure the change in Word Error Rate (WER) on the Aurora 2 and Aurora 3 databases with “perfect” endpoints. Performance with AMR 4.75 is 50% worse than with DSR on Aurora 2 and 47% worse on Aurora 3. Even at the higher data rate of AMR 12.2, AMR is 17% worse than DSR on Aurora 2 and 20% worse on Aurora 3.
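As a rough illustration of the metric behind these figures, the sketch below computes WER from a word-level edit distance and then the relative change between two WER values. It assumes that "worse" in the abstract means a relative WER increase, which the abstract does not state explicitly. The function names and all inputs are hypothetical and are not taken from the paper or from the Aurora evaluation scripts.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_increase(wer_baseline, wer_other):
    """Relative change of wer_other with respect to wer_baseline, in percent.

    Assumption: this is how a statement such as "X% worse" would be computed.
    """
    return 100.0 * (wer_other - wer_baseline) / wer_baseline


# Toy inputs for illustration only (not results from the paper).
toy_wer = word_error_rate("one two three four", "one two four")
toy_change = relative_wer_increase(4.0, 6.0)
```

Under this assumed definition, a system whose WER rises from a baseline of 4% to 6% would be described as 50% worse, even though the absolute difference is only two percentage points.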

Similar resources

Adaptive Multi - Rate Wir

Distributed speech recognition (DSR) is motivated by the fact that codecs used in speech transmission usually show degraded voice quality below a certain channel quality (carrier-to-interferer ratio, C/I), which justifies efficient coding of features with appropriate channel coding in the mobile terminal. The Adaptive Multi-Rate (AMR) speech codec standardized for GSM and UMTS, however, delivers...

Statistical Tests for Voice Activity Detection

A robust and effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The approach is based on filtering the input channel to avoid high energy noisy components and then the determination of the speech/non-speech bispectra by means of third order autocumulants. This algorithm differs from many others in the way the decisi...

Bispectra Analysis-Based VAD for Robust Speech Recognition

A robust and effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The approach is based on filtering the input channel to avoid high energy noisy components and then the determination of the speech/non-speech bispectra by means of third order autocumulants. This algorithm differs from many others in the way the decisi...

Bispectrum-Based Statistical Tests for VAD

In this paper, we propose a voice activity detection (VAD) algorithm for improving speech recognition performance in noisy environments. The approach is based on statistical tests applied to a multiple-observation window, which in turn rely on determining the speech/non-speech bispectra by means of third-order auto-cumulants. This algorithm differs from many others in the way the decision rule is formu...

Independent Component Analysis Applied to Voice Activity Detection

In this paper we present the first application of Independent Component Analysis (ICA) to Voice Activity Detection (VAD). The accuracy of a multiple observation-likelihood ratio test (MO-LRT) VAD is improved by transforming the set of observations to a new set of independent components. Clear improvements in speech/non-speech discrimination accuracy for low false alarm rate demonstrate the effe...

Publication date: 2002